Abstract

We performed a meta-analysis of 37 published studies to summarize treatment outcomes associated with
skills training with antisocial youth. Consistent with our hypothesis, results
indicated that skills training interventions delivered in the context of homogeneous groups of 
deviant peers produced smaller benefits than did skills training interventions delivered in the 
context of mixed groups of prosocial and deviant peers, or individual treatment. Also, as expected, 
treatment provided in the context of deviant-only groups attenuated treatment benefits more for
youth who might only be at risk for serious conditions than for more severely disordered youth,
such as those who are incarcerated or placed in a class for behavioral or emotional problems.
Impact of Modality on Skills Training for Youth with Externalizing Problems:

A Meta-Analysis

Research on interventions designed to reduce childhood aggression and prevent adolescent 
delinquency and substance abuse has expanded rapidly in the past decade. One of the most popular 
intervention methods in schools and clinics is group-based skills training in which a group of youth 
with conduct problems are taught a set of social or problem-solving skills to help them better 
negotiate problem situations without using aggressive means (Kazdin, 1997). Group-based skills 
training is popular in part because of its ease and efficiency of administration. Whereas some
group-based interventions have documented positive outcomes, others have failed to do so, and some
interventions result in detrimental outcomes (see Arnold & Hughes, in press). 

In a meta-analytic investigation of group social skills training with children experiencing a 
range of problems, Beelmann, Pfingsten, and Lösel (1994) found that group-based skills training
resulted in modest short-term gains but that long-term benefits were generally lacking. Also,
short-term gains varied as a function of assessment method, with larger gains for measures of targeted
skills and smaller gains on measures of socially consequential outcomes. Because the Beelmann et
al. meta-analysis included children with problems other than externalizing disorders, their findings
may or may not be characteristic of skills training with antisocial¹ youth. A quantitative summary
of the effects of skills training with deviant youth would assist in evaluating the efficacy of this 
intervention modality with this population. In particular, it is important to determine if intervention 
effectiveness differs based on characteristics of participating youth and the training program. 

In a narrative review of group-based skills training with aggressive youth, Arnold and 
Hughes (in press) argue that whether or not skills training occurs within the context of deviant-only 
groups may affect its benefits. Specifically, they suggest that the expected benefits of skills training 
with this population may be diluted as a result of unintended negative effects of aggregating deviant 
youth for purposes of skills training. Arnold and Hughes (in press) underscored the necessity of 
systematic research on the effect of grouping deviant peers for skills training interventions in order 
to test the hypothesis that aggregating delinquent youth attenuates treatment gains. They 
recommended a meta-analytic investigation of the possible moderating role of group composition 
(i.e., deviant-only treatment versus non-aggregated treatment) on the effectiveness of skills training 
with aggressive youth. Through a comprehensive review of all controlled outcome studies to date of
social skills interventions with antisocial youth, this article attempts to determine whether positive
treatment effects are greater for skills training interventions that provide individualized treatment or
treatment in groups composed of both prosocial and aggressive peers than for skills training
interventions that provide treatment in homogeneous groups of antisocial children.
We were also interested in investigating whether selected client characteristics moderated the 
effectiveness of skills training and whether client characteristics exerted such an influence 
differently based on group modality. Specifically, we expected that deviant-only group treatment 
would be less successful with youth whose antisocial behaviors were less severe, versus youth with 
more severe antisocial behavior. This expectation was based on a finding that moderately 
aggressive boys are most susceptible to the deleterious influence of aggressive friends (Vitaro, 
Tremblay, Kerr, Pagani, & Bukowski, 1997). We also expected that age might be a moderating 
variable such that treatment modality would not account for differential treatment benefit for older
youth (ages 13-18). We expected that by mid-adolescence, affiliation patterns and identification with
conforming or non-conforming peer groups would be more stable; consequently, mixing prosocial
peers and antisocial peers for treatment would have little impact on youth’s peer networks or
attitudes toward conventional versus antisocial behavior.

¹ Throughout this manuscript we use the terms “antisocial” and “deviant” to refer to youth with a range of externalizing
behavior problems including aggression, defiance, stealing, and lying; when discussing specific studies, we use terms
that characterize the subjects in the particular studies.

With the development of meta-analytic techniques, it is now possible to integrate findings 
across multiple studies and to systematically compare findings across dimensions such as outcome 
type, for example. The basis for analysis is the effect size, which is an estimate of the magnitude of 
the treatment effect adjusted for sample variability. The aggregation of results from different studies 
is a major advantage of meta-analysis over the traditional narrative literature review. A more direct 
statistical comparison of studies can be made with more control over possible bias inherent in the 
narrative review. Schmidt (1992, 1996) pointed to another advantage of a meta-analytic review: 
statistical significance testing for interpreting the data no longer plays such a dominant role. 
Statistical significance testing can lead researchers to mistakenly conclude that no relationship 
exists between two variables of interest (Type II error). One limitation of the meta-analytic
approach, however, is that many studies do not provide sufficient information or information of
sufficient detail to permit inclusion in a review (Lipsey & Wilson, 1993). 

Method 

Selection of Studies 

This review was restricted to published studies based on an assumption that published
research undergoes a review process that controls for the quality of research studies. We defined
social skills training as behavioral and/or cognitive interventions that were explicitly directed 
toward training or modifying cognitive (e.g., problem-solving skills) and/or affective (e.g., anger 
control) components of social behavior. Other criteria for inclusion of a study in the review were as 
follows: 

1. Selection was based on mean age of total sample or reported school grade. Subjects were
between 6 and 18 years of age, or were in school grades 1 through 12.

2. Subjects were described as having externalizing behavior problems including (a) childhood
aggression, (b) conduct disorder, (c) oppositional defiant disorder, (d) antisocial behavior, (e) 
violent behavior, or (f) adolescent delinquency. Studies with subjects described as hyperactive 
or experiencing peer rejection were excluded in this review unless the subjects also presented 
with conduct problems. 

3. Studies used an experimental or quasi-experimental design with at least one control group. 
Studies that included group comparisons with only a nondeviant control group, and single group 
designs were excluded. 

4. Studies involved group or individual treatment. Group composition could either be deviant-only 
peers, or mixed groups of deviant and prosocial peers. 

5. Outcome assessment reported quantifiable measures of social or behavioral adjustment. 

6. Studies were published in English between the mid-1970s and 1997.

Literature Search Procedure 

With the selection criteria listed in the preceding section, a total of 37 studies (5 individual,
5 mixed-group, and 27 deviant-group) were identified (marked with an asterisk in the Appendix).
The studies were identified through the PsycINFO and ERIC databases, which index academic and
professional literature in psychology and education, as well as related disciplines such as psychiatry,
medicine, nursing, and sociology. In a computer search, we used the following keywords in
various combinations: social skills training, problem solving skills training, cognitive therapy,
behavioral therapy, cognitive behavioral therapy, group therapy, group intervention, conduct
disorder, aggression, delinquency, antisocial, violence, deviant, and prosocial. In addition, the
references from these identified studies were inspected to locate studies appearing in other
publications.

Coding Procedure 

Each study was coded with respect to subject, treatment, and outcome measure 
characteristics. Most classifications were straightforward but a few warrant explanation. We wanted 
to assess how treatment modality and type of population relate to treatment outcome. Treatment 
modality was coded as deviant-only group treatment, individual treatment, or mixed (deviant and 
prosocial) group treatment. In order to assess the magnitude of treatment effects associated with 
homogeneous group treatment of deviant children versus either individual treatment or treatment of 
children in mixed deviant and prosocial groups, we collapsed the individual treatment and mixed 
group treatment into a single classification (non-aggregated treatment) prior to conducting our 
analyses. Collapsing across studies evaluating individual and mixed group treatments was also 
necessary in order to have reasonable power to detect differences based on treatment modality. 
Population type was classified as either preventive or clinical. The preventive population included
children and youth who manifested aggressive, disruptive, and delinquent behaviors but who had not
been clinically diagnosed and were not incarcerated or institutionalized. The clinical population
included aggressive children and youth who had either been diagnosed as conduct disordered or were
juvenile offenders who had been incarcerated or institutionalized in a psychiatric treatment facility.
Type of outcome measure was classified into five categories: (a) behavior rating scales (e.g.,
teacher or parent rating scales), (b) behavior observations (e.g., time-sampled ratings of children’s
behavior and role-play performance), (c) self-report (e.g., measures assessing self-esteem), (d)
problem-solving skills (e.g., measures assessing children’s problem-solving skills via the
presentation of hypothetical situations or vignettes), and (e) socially consequential measures (e.g.,
sociometrics, recidivism). Method of referral was also coded; the methods by which children were
referred included being in a special program (e.g., The Think First Program), the use of rating scales,
behavioral indicators, teacher nomination, teacher nomination plus the use of rating scales, and
teacher and peer ratings.

Interrater Agreement 

The first author performed the coding for all studies. Eighteen studies (48.6%) were 
randomly selected and coded by another author to test inter-rater agreement. There was perfect 
agreement for treatment modality (κ = 1.00) and type of population (κ = 1.00). The two raters
achieved kappas of .79 for type of outcome measures and .90 for method of referral. Disagreements 
in coding between the raters were resolved through discussion. 
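
As an illustration of how agreement statistics of this kind can be computed, the following is a minimal sketch that assumes the reported kappas are unweighted Cohen's kappas for two raters. The function name and the example codes are hypothetical and are not taken from the coding data of this review.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters coding the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of items on which the two raters assigned the same code.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical modality codes for four studies (d = deviant-only, m = mixed, i = individual).
first_rater = ["d", "d", "m", "i"]
second_rater = ["d", "d", "m", "i"]
print(cohens_kappa(first_rater, second_rater))
```

Kappa corrects raw percent agreement for the agreement expected by chance, which matters when one category (here, deviant-only treatment) dominates the sample.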

Estimating Treatment Effects 

Effect sizes were estimated using procedures suggested by Glass, McGaw, and Smith 
(1981). In each calculation, effect size was computed as treatment group mean minus control group 
mean divided by the control group standard deviation. We used the control group standard deviation 
instead of the pooled within-group standard deviation as the denominator because we agree with 
Bergin and Lambert (1978) that one consequence of therapy is an increase in behavioral variability. 
Thus, use of the pooled within-group standard deviation may cause statistical and interpretational 
problems (Smith, Glass, & Miller, 1980) which we attempted to avoid. 

For some outcome measures, higher numbers indicated greater improvement but for other 
measures lower numbers indicated greater improvement. Effect sizes were calculated in a consistent 
manner such that positive scores indicate that the treatment group improved more than the control
group and negative scores indicate that the control group improved more than the treatment group. 
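
The effect size computation described above can be sketched as follows. This is only an illustration under stated assumptions: the function name, the `higher_is_better` flag, and the sample scores are hypothetical, and the sample standard deviation (n - 1 denominator) is assumed for the control group.

```python
from statistics import mean, stdev


def glass_delta(treatment_scores, control_scores, higher_is_better=True):
    """Treatment mean minus control mean, divided by the control group SD."""
    difference = mean(treatment_scores) - mean(control_scores)
    delta = difference / stdev(control_scores)  # sample SD of the control group only
    # Flip the sign for measures on which lower scores mean improvement,
    # so that a positive value always favors the treatment group.
    return delta if higher_is_better else -delta


# Hypothetical posttreatment scores on a measure where higher is better.
print(glass_delta([4, 5, 6], [1, 2, 3]))
```

Because only the control group variance enters the denominator, any increase in variability produced by treatment does not shrink the estimate.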

Most effect sizes were calculated from means and standard deviations or raw data reported 
in the study. When this information was unavailable, effect size was estimated from other reported 
statistics (e.g., t, F, or chi-square). Various techniques for estimating such effect sizes are reported 
in previous work (see Glass et al., 1981, chap. 5). 

In some studies, findings for some of the outcome measures that had been used were either not
reported or reported only as nonsignificant. In such cases, an exact effect size could not be computed.
However, not including these outcome measures in the analyses would have artificially inflated the 
overall estimate of effect size because researchers were more likely to provide complete or adequate 
information on those measures that demonstrated statistically significant or large treatment effects. 
Thus, when the results from an outcome measure were not reported or were reported only as 
nonsignificant, we conservatively estimated the effect size to be zero. 

Most studies compared treatments on more than one outcome measure. Because multiple 
effect sizes derived from the same study may not represent statistically independent observations, 
an analysis based on such nonindependent observations can underestimate error variance and inflate 
tests of statistical significance (see Glass et al., 1981, chap. 6). To avoid this problem of 
nonindependence of observations, we averaged multiple effect sizes obtained from individual 
measures within the same treatment comparison. However, separate means for different types of 
outcome measures were also calculated because we wished to assess treatment effects associated 
with particular types of outcome measures. We calculated separate means for five types of outcome 
measures: behavior ratings, behavior observations, self-report, problem-solving skills, and socially 
consequential measures. Multiple effect sizes obtained for each outcome measure type (e.g., 
behavior ratings) within a study were averaged to obtain a single effect size for that outcome 
measure type. For example, if behavior ratings yielded three effect sizes within a study, these effect 
sizes would be averaged to yield a single effect size value associated with behavior ratings for that 
study. 
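
A minimal sketch of this averaging step is given below, assuming that each treatment comparison is represented as a list of (outcome type, effect size) pairs and that unreported or nonsignificant results have already been entered as zero. The function name and the example values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def summarize_comparison(effects):
    """Return (overall mean, per-outcome-type means) for one treatment comparison.

    effects: list of (outcome_type, effect_size) pairs from individual measures.
    """
    by_type = defaultdict(list)
    for outcome_type, effect_size in effects:
        by_type[outcome_type].append(effect_size)
    # One effect size per outcome type, so each study contributes at most once per type.
    per_type = {outcome: mean(values) for outcome, values in by_type.items()}
    # One overall effect size: the mean across all individual measures.
    overall = mean([effect_size for _, effect_size in effects])
    return overall, per_type


# Hypothetical study: three behavior ratings and one unreported self-report result (entered as 0).
example = [
    ("behavior ratings", 0.2),
    ("behavior ratings", 0.4),
    ("behavior ratings", 0.6),
    ("self-report", 0.0),
]
overall, per_type = summarize_comparison(example)
print(overall, per_type)
```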

Several studies reported the results of more than one treatment comparison. This problem of 
nonindependence was treated in a manner similar to that used for multiple outcome measures. For 
example, if a study yielded two different comparisons of treatment (e.g., social skills training and 
social skills training with in vivo practice) to no treatment, we averaged the results of these 
comparisons to get a single value for the treatment group to be used in the analyses. 

Another issue we encountered was the use of multiple control groups. Some studies used a 
single control group whereas others utilized two types of control groups (e.g., attention-placebo, no 
treatment control). In studies where more than one control group was used, we used the more 
stringent control group as our control comparison against the treatment comparison. For example, if 
a study had three groups, social skills training group, attention-control, and no treatment control, the 
more stringent attention-control group would serve as our control comparison. 

In a typical meta-analysis, an index of effect size is used to summarize the results of each 
study, and effect size indices may then be averaged to obtain an overall estimate of effect 
magnitude. Conventional statistical methods such as analysis of variance (ANOVA) and multiple 
regression are then used to study the variation in the effects across studies. Hedges and Becker 
(1986) argued that conventional analyses frequently involve serious violations of the assumptions of 
these techniques, and have demonstrated that statistical methods they developed overcome both 
conceptual and statistical problems posed by conventional statistical analyses of effect sizes. 
Conceptually, conventional analysis lacks the ability to test the consistency of effect sizes across 
studies. This limitation is important because combining effect sizes across studies makes sense only 
if the studies have a common population effect size. Consequently, it is impossible to construct a 
test for whether the systematic variation in the effect sizes is larger than the nonsystematic variation
exhibited by those effect sizes. Conventional methods are also problematic for statistical reasons. 
Conventional statistical procedures (e.g., ANOVA, multiple regression analysis) rely on parametric 
assumptions about the data that are not fulfilled for effect size data. These procedures require that 
the unsystematic variance associated with every observation is the same. The unsystematic variance 
of estimates of effect size is proportional to 1/n, where n is the sample size of the study on which 
the estimate is based. Therefore, if the sample sizes of the studies vary widely, which is usually the 
case, the effect size estimates will have different error variances. The F-test is not necessarily robust 
to severe violations of the homogeneity-of-variance assumption in ANOVA and regression 
analyses. 

Hedges and Becker (1986) developed new statistical procedures to overcome the conceptual 
and statistical problems associated with conventional analytic procedures. Hedges’ (1981) 
correction factor is applied to the effect size estimate obtained via the procedure outlined by Glass 
et al. (1981) to yield the unbiased estimator d. The variance of d is completely determined by 
sample sizes and the value of d. Therefore, it is possible to determine the sampling variance of d 
from a single observation. This ability to determine the non-systematic variance of d from a single 
observation of d is the crux of modern statistical methods for meta-analysis. Refer to Hedges and
Becker (1986) for a comprehensive review and demonstration of these techniques. The authors used 
Hedges and Becker’s techniques in the computation of the unbiased estimator d and in all effect size 
analyses. 
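
The following sketch illustrates the kind of computation involved. It assumes the commonly cited approximation to Hedges' correction factor, 1 - 3/(4m - 1) with m = n_t + n_c - 2, and the usual large-sample variance formula for d; the exact degrees of freedom and variance expression appropriate to an estimate standardized by the control group standard deviation may differ, so these formulas and all names here should be read as assumptions rather than as the procedure actually used.

```python
def hedges_d(raw_effect_size, n_treatment, n_control):
    """Apply the small-sample correction to a raw standardized mean difference."""
    m = n_treatment + n_control - 2  # assumed degrees of freedom
    correction = 1.0 - 3.0 / (4.0 * m - 1.0)
    return correction * raw_effect_size


def variance_of_d(d, n_treatment, n_control):
    """Large-sample variance of d, which depends only on the sample sizes and d."""
    n_total = n_treatment + n_control
    return n_total / (n_treatment * n_control) + d * d / (2.0 * n_total)


# Hypothetical study with 10 treated and 10 control subjects and a raw effect size of 0.5.
d = hedges_d(0.5, 10, 10)
print(d, variance_of_d(d, 10, 10))
```

Because the variance is a function of the sample sizes and d alone, each study supplies its own error variance, and that is what allows the studies to be weighted.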



Results 

Sample and Treatment Characteristics 

Five studies were classified as individual treatment, five studies were classified as mixed
group treatment, and 27 studies were classified as deviant-only treatment. For posttreatment data,
11 effect sizes were obtained from individual treatment studies, 14 effect sizes were obtained from
mixed group treatment studies, and 72 effect sizes were obtained from deviant-only group treatment
studies. The mean age of the subjects was 11.54 years (SD = 2.84; range = 6.0 to 18.1). On average,
85% of the youngsters sampled were male. Of the 37 studies, 15 did not report information
regarding ethnicity. The remaining studies had ethnicity breakdowns with the following mean
percentages: 47% Anglo-Americans, 51.9% African-Americans, 1.8% Hispanics, and 1.4% Others.
Across studies, the mean number of sessions per week was 1.83 (SD = 1.03; range = 1 to 5) and the
mean number of minutes per session was 59.16 (SD = 30.41; range = 20 to 180). The average
treatment duration in weeks was 13.26 (SD = 17.09; range = 3 to 104). Eighteen studies (48.6%)
provided follow-up data, and 19 studies (51.4%) did not. For
the 18 studies with follow-up data, the mean length of time between posttreatment and
follow-up in months was 5.18 (SD = 9.18; range = 1 to 36).

Therapist Training 

Ten studies (27.0%) did not provide information on the experience level of the therapist. Of
the 27 studies (73.0%) that did report such information, therapy was provided by professionals in 19 
studies (50%), graduate students in 7 studies (18.4%), and university professors in 1 study (2.6%). 

Testing Homogeneity of Effect Size 

The unbiased estimator d as described previously, is based upon effect size estimates of 
independent samples. A weighted average D is a precise combination of values of d that takes into 
account the variances of d. Such a combination of effect sizes across studies makes sense only if the 
studies shared a common underlying population effect size. A test of the homogeneity of effect size 
involves the computation of the Qt statistic, which is the weighted sum of squares of the effect size
estimates about the weighted mean D. If all the studies share a common underlying population
effect size, then Qt has approximately a chi-square distribution with k-1 degrees of freedom, where
k is the number of studies.
However, if sufficient heterogeneity exists, then Qt will tend to be larger than expected by chance.
Thus the hypothesis of homogeneity of effect size is rejected at the significance level α if Qt
exceeds the 100(1 - α) percent critical value of the chi-square distribution with k-1 degrees of
freedom. The overall effect size across the 37 studies was 0.49. Because the homogeneity test
revealed that the effect sizes across the studies differed beyond chance (Qt = 72.90, p < .05), any
attempt to interpret the overall average effect size may be misleading, and hence an investigation of 
factors that may moderate effect size is warranted. 
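
As a sketch of the homogeneity test just described, the weighted mean and the homogeneity statistic can be computed from the effect sizes and their variances, assuming inverse-variance weights. The function name and the three example effect sizes are hypothetical.

```python
def weighted_mean_and_q(effect_sizes, variances):
    """Return the inverse-variance weighted mean D and the homogeneity statistic Qt."""
    weights = [1.0 / v for v in variances]
    d_bar = sum(w * d for w, d in zip(weights, effect_sizes)) / sum(weights)
    # Weighted sum of squared deviations of each effect size about the weighted mean.
    q_total = sum(w * (d - d_bar) ** 2 for w, d in zip(weights, effect_sizes))
    return d_bar, q_total


# Three hypothetical studies with equal error variances.
d_bar, q_total = weighted_mean_and_q([0.2, 0.4, 0.6], [0.1, 0.1, 0.1])
print(d_bar, q_total)
```

The resulting statistic would then be compared with the chi-square critical value on k-1 degrees of freedom (here, 2 degrees of freedom, for which the .05 critical value is 5.99).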

Effect of Treatment Modality on Outcome Measures 

The authors sought to determine the effect of modality (individual and/or mixed versus 
deviant-only) on various outcome variables using the procedures outlined by Hedges and Becker 
(1986). This procedure is an analogue to ANOVA for effect sizes which permits the authors to test 
the significance of variation between groups of effect sizes, and to test if the remaining variation 
within groups of effect sizes is significant. The ANOVA for effect sizes involves partitioning the 
overall homogeneity statistic Qt into the between-group homogeneity (Qb) and the within-group 
homogeneity (Qw). The between-group homogeneity statistic Qb is analogous to the F statistic for
testing for between-group differences in a conventional ANOVA. When there are g groups the
statistic Qb has approximately a chi-square distribution with g-1 degrees of freedom. When testing
for between-group differences, Qb is compared with the 100(1 - α) percent critical value of the
chi-square distribution with g-1 degrees of freedom, and if Qb exceeds the critical value, the
between-group difference is significant at level α. The within-group homogeneity statistic Qw is the sum of
the homogeneity statistics calculated for each of the g groups as if each group were an entire 
collection of studies. 
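
The partitioning can be illustrated with a short sketch, again assuming inverse-variance weights. The function name, the group labels, and the example values are hypothetical and do not reproduce the data of this review.

```python
def partition_q(groups):
    """Partition Qt into between-group (Qb) and within-group (Qw) components.

    groups: dict mapping a group label to a list of (effect_size, variance) pairs.
    """
    def homogeneity(pairs):
        weights = [1.0 / variance for _, variance in pairs]
        d_bar = sum(w * d for w, (d, _) in zip(weights, pairs)) / sum(weights)
        return sum(w * (d - d_bar) ** 2 for w, (d, _) in zip(weights, pairs))

    all_pairs = [pair for pairs in groups.values() for pair in pairs]
    q_total = homogeneity(all_pairs)
    # Qw: each group is treated as if it were an entire collection of studies.
    q_within = sum(homogeneity(pairs) for pairs in groups.values())
    q_between = q_total - q_within
    return q_total, q_between, q_within


# Hypothetical effect sizes and variances for two modality groups.
example = {
    "deviant-only": [(0.2, 0.1), (0.4, 0.1)],
    "non-aggregated": [(0.6, 0.1), (0.8, 0.1)],
}
print(partition_q(example))
```

Qb is referred to a chi-square distribution with g-1 degrees of freedom (1 in this two-group example), and Qw to one with k-g degrees of freedom.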

There are two sub-categories in the non-aggregated treatment modality classification, 
individual treatment and mixed (deviant and prosocial) group treatment. The mean effect size for 
the five individual treatment studies (0.72) and the five mixed group treatment studies (0.67) did not 
differ from each other statistically, t(20) = 0.32, ns. Because the effect sizes for these two categories 
were comparable, they were collapsed into a single classification (non-aggregated treatment) for the 
analyses. 

Modality (non-aggregated versus deviant-only) across all outcome variables was found to be
statistically significant (Qb = 27.57, p < .05). Table 1 reports effect sizes by treatment modality for
all effects. The pattern of means was consistent with the hypothesis; the means were 0.42 and 0.69
in the deviant-only group and the non-aggregated group, respectively. We next tested whether the
advantage of non-aggregated grouping was consistent across different types of outcome measures.
For posttreatment behavior observations, the effect of modality was statistically significant
(Qb = 17.69, p < .05). The mean for the deviant-only group was 0.39 and the mean for
the non-aggregated group was 0.80. For problem-solving skills, the mean effect size for the
deviant-only group (0.14) differed significantly from the mean effect size for the non-aggregated
group (0.66), Qb = 17.87, p < .05. Similarly, the effect of modality was found to be
statistically significant for socially consequential outcome measures (Qb = 8.06, p < .05). Once
again, the pattern of means was in a direction consistent with that of the hypothesis; the means were
0.95 and 1.34 for the deviant-only and non-aggregated groups, respectively. Differences between
deviant-only modality and the non-aggregated modality for outcome variable categories behavior
ratings and self-report did not reach statistical significance at α = .05 but were in the same direction
as the overall findings. 

In addition to examining the impact of modality on posttreatment effect sizes, we conducted 
similar analyses for 38 follow-up effect sizes which were obtained from the studies reporting 
follow-up data. The overall effect size for follow-up data was found to be 0.30. Because the test of
homogeneity revealed that sufficient heterogeneity existed among the effect sizes (Qt = 55.81, p <
.05), the overall effect size is an inadequate summary index. Once again, we examined
moderating factors of these effect sizes. Modality (non-aggregated versus deviant-only) across all 
outcome variables was found to be statistically significant (Qb = 15.89, p < .05). The pattern of 
means was consistent with the hypothesis; the means were 0.24 and 0.51 in the deviant-only group 
and the non-aggregated group, respectively. We could not conduct further analyses on individual 
outcome variables as was performed for posttreatment data because of insufficient sample size. 

Client Characteristics That Moderate Effect Sizes 

We also wanted to examine, within the two types of population (preventive versus
clinical), whether individuals who received non-aggregated treatment had larger mean effect sizes than
individuals who received deviant-only group treatment. We predicted that the participants in the
non-aggregated treatment format would have larger mean effect sizes compared to participants in
the deviant-only group format only within the preventive population. The preventive group (n = 27)
included aggressive youth who had not been clinically diagnosed or incarcerated, and the clinical
group (n = 10) included aggressive youth who had been clinically diagnosed or incarcerated. The
overall effect sizes for the preventive and clinical groups were 0.43 and 0.65, respectively. There was
heterogeneity in effect sizes of the preventive group (Qt = 61.41, p < .05); therefore, an
investigation of factors that might moderate effect size was warranted. As expected, results of the
between-group homogeneity statistic indicated that within the less severe, preventive population,
the mean for the deviant-only group (n = 20, M = 0.34) was significantly different from the mean
for the non-aggregated group (n = 7, M = 0.67), Qb = 13.45, p < .05. However, no difference in
effect size was observed for the more severe clinical population with respect to group format, Qb =
0.00044, ns. The means were 0.59 and 0.77 for the deviant-only (n = 7) and non-aggregated (n = 3)
groups, respectively. This pattern of results suggests that the less severe, preventive group appears to
be more amenable to the effects of treatment format than is the more severe, clinical group.

We also wanted to examine our hypothesis that treatment modality would moderate effect 
sizes for children (ages 6-12) but not for adolescents (ages 13-18). Because all studies with 
adolescent clients used deviant-only grouping, we were unable to fully examine this hypothesis. 
However, for the 24 studies conducted with children, the homogeneity statistic indicated
statistically significant heterogeneity in effect sizes, Qt = 51.96, p < .05. The mean effect sizes for
the 14 deviant-only group treatment studies (M = 0.23) and for the 10 non-aggregated treatment
studies (M = 0.70) were statistically significantly different, Qb = 20.58, p < .05.

Discussion 

This study used meta-analytic techniques to summarize treatment outcomes associated with 
skills training interventions with antisocial youth. Based on both empirical and conceptual 
arguments, we expected effect sizes would vary systematically based on whether treatment was 
delivered in the context of aggregated groups of deviant youth versus non-aggregated group or 
individual treatment. Consistent with our hypothesis, skills training interventions delivered in the 
context of homogeneous groups of deviant peers produced smaller benefits than did skills training 
interventions delivered in the context of either individual treatment or mixed groups of prosocial 
and deviant peers. This finding held for an analysis of the overall effect size for each study as well
as for behavior observations, problem-solving skills, and socially consequential outcomes. Although 
differences between treatment modality groups in effect sizes for behavior ratings and self-report 
measures did not reach statistical significance, differences in effect sizes for these outcomes were 
also in the expected direction. 

It is encouraging that the largest effect sizes, irrespective of treatment modality, were found 
on measures that were socially consequential, defined in terms of outcomes that assess the impact of 
treatment on important developmental outcomes, such as peer-ratings of acceptance or aggression 
or recidivism. Although encouraging, these results should be interpreted cautiously because
few studies included such measures.

Also, consistent with expectations, youth who are “at risk” for serious conditions, such as
incarceration in a juvenile facility, are more influenced by homogeneous grouping than are youth who
already experience these conditions. This finding suggests the importance of including prosocial
children in interventions for youth at-risk for significant conduct problems. School-based 
interventions offer the possibility of providing skills training in the context of mixed groups of 
children with and without problems, as exemplified in the Prinz, Blechman, and Dumas (1994) 
study. These authors suggested that such mixed groups not only avoid the adverse outcomes 
associated with deviant peer groups but also engage “high risk children in a supportive, prosocial 
peer network” (p. 195). 

The results of this study must be interpreted in light of certain study limitations. Several 
potentially important moderator variables of the effectiveness of skills training could not be 
investigated due to limited information provided in published studies. For example, ethnic and 
gender differences in responsiveness to skills training interventions and to treatment modality are 
important to investigate, but too few investigators report results separately by gender and ethnicity 
to permit a determination of the role of gender and ethnicity in the relationship between treatment 
responsiveness and treatment modality. Similarly, subtypes of aggressive children, such as 
proactive and reactive aggressive children, would likely differ in their susceptibility to adverse 
outcomes associated with grouping deviant peers but could not be examined in this meta-analysis. 
We also could not determine if treatment modality moderates the effectiveness of skills training
with adolescents, due to the absence of intervention studies with this population that utilize
non-aggregated modalities.

Approximately half of the studies (51.4%) failed to report follow-up data. If bringing
together deviant peers for purposes of skills training results in greater association with deviant peers 
beyond the duration of the training, the adverse effects of aggregating deviant peers may be greater 
at follow-up than immediately post-treatment. Dishion et al. (1995) found evidence for adverse 
effects of aggregating at-risk teens only at post-treatment. 

The small number of studies (n = 5) that utilized groups of mixed children required that we 
combine mixed group-based interventions with individually-provided interventions. These two 
formats both avoid aggregating deviant peers; however, combining them introduces a confound in that
the differences between aggregated and non-aggregated studies could be a result of differences in
group versus individual treatment instead of aggregating deviant peers. However, our decision to
combine these two types of non-aggregated skills training studies is supported by the finding that 
their effect sizes were comparable. 
Table 1

Effect Sizes by Treatment Modality

                                  Deviant-only    Non-Aggregated    All studies (N = 37)
Outcome                            D       n        D       n          D       n
Overall                           .42ᵃ    27       .69ᵃ    10         .49     37
Behavior ratings                  .48     19       .59     10         .52     29
Behavior observations             .39ᵃ    16       .80ᵃ     3         .45     19
Self report                       .25     16       .42      4         .29     20
Problem-solving skills            .14ᵃ    11       .66ᵃ     5         .30     16
Socially consequential measures   .95ᵃ    10      1.34ᵃ     3        1.04     13

Note. n in table refers to number of effect sizes for that comparison. Each study can
contribute no more than one effect size for each outcome type.
ᵃ Effect sizes differ based on treatment modality (p < .05).